A few months ago, a friend of mine told me about her desire to better track a virtual tutoring program that she manages. I asked her to let me make a dashboard for her, and she agreed. To make the dashboard, I used R and the Flexdashboard package and came up with this as (a replica of) the final product. Below, I’ll outline the steps I took to come up with it.

Step 1: Access & load data

All the data is input and hosted on Google Sheets, so to access it I used the googlesheets4 package along with a host of others. This was my first introduction to APIs (application programming interface), which allows two applications to talk to each other (in this case Google and R) and gain access to information that is otherwise private.

It took me a while, but I was finally able to figure out the API stuff. For this post, I’ve created “fake” data in Google Sheets and made it publicly viewable. (This simplifies the process). However, I’ve commented out the code you’d need in real life to access your private Google Sheets data.

library(googlesheets4) # access Google sheets data
library(flexdashboard) # dashboard package, great alternative to Shiny
library(shiny) # I still wanted to use some Shiny functionalities
library(DT) # nice html data tables
library(lubridate) # manipulating date data
library(plotly) # easy-to-code interactive plots
library(janitor) # cleaning names
library(tidyverse) # pipes, data wrangling, etc

# options(
#  gargle_oauth_cache = ".secrets",
#  gargle_oauth_email = TRUE)

# gs4_auth(scope = "https://www.googleapis.com/auth/drive")
# drive_auth(token = gs4_token())

gs4_deauth() # asking Google for permission to access data # gives R permission
             # publicly (not private) viewable Googlesheets only

gs_tutoring <- "1vyVNz2ClxDVcPr7oa1LTc0YyHwZ53DI_lBSvKtl3v8M" # Googlesheet data

Step 2: Tidy the data

Data rarely comes in the form that you need it for proper analysis and visualizations. You often have to manipulate from a “wide” format into a “long” (aka “tidy”) format. After your data is in the right format, you transform it to your heart’s desire.

{{< panelset class=“greetings” >}} {{< panel name=“Not tidy (How it’s given) :thumbsdown:” >}}

## # A tibble: 28 x 60
##       id Status Orientation Day     Time     `Tutor Name` `Tutor Phone` `Tutor Unexcuse~` `Tutor Unexcus~` `Subject Tutor~` `Student Name` `Student Unexc~`
##    <int> <chr>  <chr>       <chr>   <chr>    <chr>        <chr>         <chr>             <chr>            <chr>            <chr>          <chr>           
##  1     1 Active AB          Monday  4:30 PM  Tutor 1      <NA>          1                 0                Honors Biology   Student 1      0               
##  2     2 Active ZR          Monday  4:45 PM  Tutor 2      <NA>          0                 1                AP Calculus      Student 2      1               
##  3     3 Active DG          Monday  5:00 PM  Tutor 3      <NA>          2                 1                Earth Science    Student 3      0               
##  4     4 Active LJ          Monday  5:30 PM  Tutor 4      <NA>          0                 0                Math 2           Student 4      0               
##  5     5 Active AB          Monday  5:30 PM  Tutor 5      <NA>          0                 1                Math 2           Student 5      1               
##  6     6 Active ZR          Monday  5:30 PM  Tutor 6      <NA>          0                 0                AP Chemistry     Student 6      1               
##  7     7 Active DG          Monday  6:00 PM  Tutor 7      <NA>          0                 0                IB History       Student 7      0               
##  8     8 Active AB          Monday  6:00 PM  Tutor 8      <NA>          0                 0                AP Biology       Student 8      0               
##  9     9 Active ZR          Monday  6:00 PM  Tutor 9      <NA>          0                 0                AP Calculus      Student 9      0               
## 10    10 Active DG          Tuesday 11:00 AM Tutor 10     <NA>          0                 1                Chemistry        Student 10     1               
## # ... with 18 more rows, and 48 more variables: `Student Unexcused Absences` <chr>, Cohort <chr>, Counselor <chr>, `Start Date` <date>,
## #   `Projected End Date` <date>, `Actual End Date` <date>, `2021-09-20` <chr>, `2021-09-27` <chr>, `2021-10-04` <chr>, `2021-10-11` <chr>,
## #   `2021-10-18` <chr>, `2021-10-25` <chr>, `2021-11-01` <chr>, `2021-11-08` <chr>, `2021-11-15` <chr>, `2021-11-22` <chr>, `2021-11-29` <chr>,
## #   `2021-12-06` <chr>, `2021-12-13` <chr>, `2022-01-03` <chr>, `2022-01-10` <chr>, `2022-01-17` <chr>, `2022-01-24` <chr>, `2022-01-31` <chr>,
## #   `2022-02-07` <chr>, `2022-02-14` <chr>, `2022-02-21` <chr>, `2022-02-28` <chr>, `2022-03-07` <chr>, `2022-03-14` <chr>, `2022-03-21` <chr>,
## #   `2022-03-28` <chr>, `2022-04-04` <chr>, `2022-04-11` <chr>, `2022-04-18` <chr>, `2022-04-25` <chr>, `2022-05-02` <chr>, `2022-05-09` <chr>,
## #   `2022-05-16` <chr>, `2022-05-23` <chr>, `2022-05-30` <chr>, `2022-06-06` <chr>, `Total Sessions Missed` <dbl>, `Total Sessions  Held` <dbl>, ...

{{< /panel >}} {{< panel name=“Tidy (How we want it) :thumbsup:” >}}

df_attendance_long
## # A tibble: 1,008 x 27
##       id Status Orientation week_of    tutoring_date Day    Time    `Tutor Name` `Tutor Phone` `Tutor Unexcused Tardies` `Tutor Unexcuse~` `Subject Tutor~`
##    <int> <chr>  <chr>       <date>     <date>        <chr>  <chr>   <chr>        <chr>         <chr>                     <chr>             <chr>           
##  1     1 Active AB          2021-09-20 2021-09-20    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  2     1 Active AB          2021-09-27 2021-09-27    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  3     1 Active AB          2021-10-04 2021-10-04    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  4     1 Active AB          2021-10-11 2021-10-11    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  5     1 Active AB          2021-10-18 2021-10-18    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  6     1 Active AB          2021-10-25 2021-10-25    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  7     1 Active AB          2021-11-01 2021-11-01    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  8     1 Active AB          2021-11-08 2021-11-08    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
##  9     1 Active AB          2021-11-15 2021-11-15    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
## 10     1 Active AB          2021-11-22 2021-11-22    Monday 4:30 PM Tutor 1      <NA>          1                         0                 Honors Biology  
## # ... with 998 more rows, and 15 more variables: `Student Name` <chr>, `Student Unexcused Tardies` <chr>, `Student Unexcused Absences` <chr>,
## #   Cohort <chr>, Counselor <chr>, `Start Date` <date>, `Projected End Date` <date>, `Actual End Date` <date>, `Total Sessions Missed` <dbl>,
## #   `Total Sessions  Held` <dbl>, `Total Sessions Possible` <dbl>, `Attendance Percentage` <dbl>, day_numeric <dbl>, `Attendance Proportion` <dbl>,
## #   attendance_code <chr>

{{< /panel >}} {{< panel name=“Tidy (How we want it) :thumbsup:” >}}

gs_attendance_col_types <-
  "ccccccccccccccDDDccccccccccccccccccccccccccccccccccccccccccccnnnn"

df_attendance <-
  read_sheet(gs_tutoring, 
             range = "A9:BM",
             n_max = 100,
             col_types = gs_attendance_col_types)%>% 
  select(-c("8/9", "8/16", "8/23", "8/30", "9/6", "9/13", "12/20", "12/27")) %>%
  mutate(id = row_number(), .before = "Status") %>%
  mutate(Day = case_when(
    Day == "MON" ~ "Monday",
    Day == "TUE" ~ "Tuesday",
    Day == "WED" ~ "Wednesday",
    Day == "TH" ~ "Thursday",
    Day == "F" ~ "Friday")) %>% 
  mutate(day_numeric = case_when(
    Day == "Monday" ~ 2,
    Day == "Tuesday" ~ 3,
    Day == "Wednesday" ~ 4,
    Day == "Thursday" ~ 5,
    Day == "Friday" ~ 6)) %>%
  mutate(`Attendance Percentage` = round(`Attendance Percentage` , digits = 2) * 100,
         `Attendance Proportion` = `Attendance Percentage` / 100) %>% 
  # What's a better way to do this? Regex?
  rename("2021-09-20" = "9/20",
         "2021-09-27" = "9/27",
         "2021-10-04" = "10/4",
         "2021-10-11" = "10/11",
         "2021-10-18" = "10/18",
         "2021-10-25" = "10/25",
         "2021-11-01" = "11/1",
         "2021-11-08" = "11/8",
         "2021-11-15" = "11/15",
         "2021-11-22" = "11/22",
         "2021-11-29" = "11/29",
         "2021-12-06" = "12/6",
         "2021-12-13" = "12/13",
         "2022-01-03" = "1/3",
         "2022-01-10" = "1/10",
         "2022-01-17" = "1/17",
         "2022-01-24" = "1/24",
         "2022-01-31" = "1/31",
         "2022-02-07" = "2/7",
         "2022-02-14" = "2/14",
         "2022-02-21" = "2/21",
         "2022-02-28" = "2/28",
         "2022-03-07" = "3/7",
         "2022-03-14" = "3/14",
         "2022-03-21" = "3/21",
         "2022-03-28" = "3/28",
         "2022-04-04" = "4/4",
         "2022-04-11" = "4/11",
         "2022-04-18" = "4/18",
         "2022-04-25" = "4/25",
         "2022-05-02" = "5/2",
         "2022-05-09" = "5/9",
         "2022-05-16" = "5/16",
         "2022-05-23" = "5/23",
         "2022-05-30" = "5/30",
         "2022-06-06" = "6/6")

df_attendance_long <-  df_attendance %>%
  pivot_longer(cols = matches("^[0-9]{4}-[0-9]{2}-[0-9]{2}$"),
               names_to = "week_of",
               values_to = "attendance_code") %>%
  mutate(week_of = ymd(week_of)) %>%
  mutate(attendance_code = case_when(attendance_code == "x" ~ "X", 
                                   TRUE ~ as.character(attendance_code))) %>% 
  mutate(tutoring_date = case_when(
    Day == "Monday" ~ as_date(week_of) + 0,
    Day == "Tuesday" ~ as_date(week_of) + 1,
    Day == "Wednesday" ~ as_date(week_of) + 2,
    Day == "Thursday" ~ as_date(week_of) + 3,
    Day == "Friday" ~ as_date(week_of) + 4), .before = "Day")%>% 
  relocate(week_of, .after = "Orientation")

{{< /panel >}} {{< /panelset >}}

Step 3: Visualize the data

Now that the data’s in the proper format, it’s time time to visualize it! The Flexdashboard package has all sorts of neat functions like value boxes and gauges that are great for progress monitoring.

n_total_sessions <- sum(df_attendance_long$attendance_code == "1",  
                      df_attendance_long$attendance_code == "2",
                      df_attendance_long$attendance_code == "3", na.rm = TRUE)

valueBox(
  n_total_sessions,
  icon = "fa-pencil",
  href = gs_tutoring
)

156